move uberenv to new spack release #77
spack v0.10.0 was released on 2017-01-17.
In addition to testing packages, this requires reworking our ...
Thoughts + Notes on upgrading our developer TPL build process to a newer version of spack. These notes apply not only to Conduit, but also to Axom and Alpine (https://github.com/Alpine-DAV/), since they use the same spack-based TPL build strategy. Since Axom and Alpine both depend on Conduit, we expect to work out the upgrade process in Conduit and then propagate it to those projects. Additionally, there are other LLNL codes that have similar requirements -- even if they aren't using uberenv.

Our ultimate goal is a sharable & reproducible TPL build process. Once someone has paved the way for building TPLs on a platform (e.g., a specific HPC cluster or OSX) with a given set of compilers, other team members should be able to easily replicate this. "Easily" here means minimally selecting which compiler to use, and possibly some feature variants -- the platform should be detected automatically. A few key things we want to do in support of this goal are outlined below.
Towards this goal, spack does the heavy lifting for us -- however, we also use a small python script named uberenv (https://github.com/LLNL/conduit/blob/master/scripts/uberenv/uberenv.py) to automate the process. uberenv is a thin veneer around spack that helps automate TPL builds for Axom, Conduit, and Alpine. Here is what uberenv currently does:

- checks out spack from github (a specific hash, selected by an entry in a json file)
- optionally helps set up a spack mirror in a shared location
- patches spack to disable any user or system settings related to compilers
- copies in a compilers.yaml file with blessed compilers
- patches spack to limit the max number of build jobs to 8
- copies a set of custom spack packages over the built-in spack package repo files
- launches the spack build of a special "uberenv-zzz" package for a specific spack spec

This special "uberenv-zzz" package specifies all of the dependencies needed to develop the desired software project zzz (e.g., Conduit, Axom). It does not build these software projects; instead, it generates a file that can be used to locate the compilers and all of the TPLs that spack built. This file is called a "host-config" file because we use the host name in the names of these files. The host-config file contents are used as a CMake initial cache file, which the build systems of all of these projects support. We revision control these files, so that anyone on a system with a shared TPL install can use this file to bootstrap a build. Long term, we hope to simplify uberenv and rely more on spack features to achieve the same process.
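For reference, here is a minimal sketch of what a single "blessed" compiler entry looks like in the compilers.yaml schema used by newer spack versions. The compiler version, paths, and operating system name below are placeholders, not Conduit's actual settings, and the exact layout may differ between spack releases:

```yaml
# Sketch of one entry in a "blessed" compilers.yaml.
# All versions, paths, and OS names are placeholders.
compilers:
- compiler:
    spec: gcc@4.9.3
    operating_system: rhel7
    modules: []
    paths:
      cc: /usr/tce/packages/gcc/gcc-4.9.3/bin/gcc
      cxx: /usr/tce/packages/gcc/gcc-4.9.3/bin/g++
      f77: /usr/tce/packages/gcc/gcc-4.9.3/bin/gfortran
      fc: /usr/tce/packages/gcc/gcc-4.9.3/bin/gfortran
```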
The version of spack we are using is quite old, but to update we need to address a few issues:

1. The version we are using allows us to easily craft a single compilers.yaml file that outlines details for a wide range of systems. We can do so generally (say for Linux or OSX) and provide more specific options for a known HPC cluster (we are using LLNL's SYS_TYPE var). For a concrete example, see: https://github.com/LLNL/conduit/blob/master/scripts/uberenv/compilers.yaml. Newer versions of spack support different naming schemes to identify platforms. We would even be happy with host-name based solutions, but we need to understand what is supported and how we can craft our new compilers.yaml file(s).
2. We want to use external packages for a very small set of TPLs. The version we are using lacks support for external packages. In our automated build process for Conduit and Alpine on LLNL clusters, we rely on the proper MPI and CUDA being exposed in a user's PATH (MPI via looking for mpicc, CUDA via looking for nvcc). Axom's TPLs don't require MPI yet. In our automated build process for Axom TPLs, we have manually created files that augment the spack-generated host config to enable MPI. These manually edited files are revision controlled, and provide per platform + compiler paths to MPI. In both cases, we want to get away from our current solutions and instead use external package support via packages.yaml. With a newer version of spack, I don't know the correct way to use packages.yaml to select a specific MPI for a given platform + compiler (see the sketch after this list). We need to discuss what is supported. We could manage the platform specifics via uberenv spack tweaks (forcing a specific compilers.yaml + packages.yaml file based on SYS_TYPE), but we would like to do as much as possible using spack features.
3. We would like spack command line options that allow us to specify a "compilers.yaml" and "packages.yaml" file. When these are passed, spack needs to use them and ignore any other user or system level spack settings. When this exists, we can remove the uberenv step that patches spack to disable user settings.
4. We would like a spack command line option to limit the max build jobs, to replace our current patch. (Perhaps this already exists?)
5. In some cases, errors that occur when building packages aren't captured in log files. Unfortunately, this happens in LLNL batch jobs, which are a very important case for our automated builds. I have also seen it in a CI setting where I was building a docker container inside of a docker container. This is a big issue because when things go wrong, we have to spawn another build by hand (outside of a batch job) to try to tackle the error.
6. How do we move our custom packages forward? There are two issues here:
   a. Our current packages are tested heavily and frequently on LLNL's HPC clusters, and we had to harden them against the build horrors we experienced (for example on BG/Q). New packages have not been exposed to the same vetting. There will be many issues when we upgrade.
   b. We can address this with our own packages, but we want to use as many off-the-shelf packages as possible.
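As a rough illustration of what we are after in item 2, here is a hedged sketch of registering a system MPI as an external package via packages.yaml. The package name, spec, and install prefix are placeholders, and the exact schema depends on the spack version:

```yaml
# Sketch only: point spack at a system MPI instead of building one.
# The spec string and prefix below are placeholders for a real install.
packages:
  mvapich2:
    paths:
      mvapich2@2.2%gcc@4.9.3: /usr/tce/packages/mvapich2/mvapich2-2.2-gcc-4.9.3
    buildable: False
  all:
    providers:
      mpi: [mvapich2]
```

Marking the package as not buildable tells spack it must use the listed install rather than building its own copy.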
The output issue (5 above) is now resolved in spack develop, with the merge of spack/spack#5084.
For the build jobs issue (4 above), there is a config option in the new version of spack to limit the number of build tasks.
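If that option is exposed through spack's config.yaml, a sketch of what could replace our current patch might look like this (assuming a build_jobs key; the exact key name should be confirmed against the target spack release):

```yaml
# Sketch: cap spack's parallel build jobs, replacing the uberenv patch
# that hard-codes the limit to 8.
config:
  build_jobs: 8
```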
We met with @becker33; here are a few near-term things we will work on:
This is complete with #225, which enhanced uberenv to fill the gaps needed for us to update to a newer spack. However, this issue still outlines our wishlist for spack support that would simplify what uberenv requires.
There should be a new spack release before SC16; evaluate whether we should move uberenv to that version.