
LBANN #3049

Merged: 10 commits merged into spack:develop on Feb 13, 2017

Conversation

bvanessen
Contributor

Recipe for building the Livermore Big Artificial Neural Network (LBANN) training toolkit. Contains a limited feature set and is optimized for building with GNU gcc and OpenBLAS.


class Lbann(CMakePackage):
    """LBANN: Livermore Big Artificial Neural Network Toolkit. A distributed memory, HPC-optimized, model and data parallel training toolkit for deep neural networks."""
Member

some new lines are needed to fit the 80-char limit.
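For example, the docstring could be wrapped roughly like this (a sketch; the exact line breaks are up to the author):

    from spack import *


    class Lbann(CMakePackage):
        """LBANN: Livermore Big Artificial Neural Network Toolkit. A distributed
        memory, HPC-optimized, model and data parallel training toolkit for deep
        neural networks."""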


    variant('shared', default=True,
            description='Enables the build of shared libraries')
    variant('int64', default=False,
Member

i don't think those belong in this package at all. You should be able to do
`spack install lbann ^elemental~int64` and the like, if you need to control it.

depends_on('cmake', type='build')

# Currently required to allow lbann to specify a specific BLAS library
depends_on('blas')
Member

this package does not use blas/lapack, please remove.

Member

specify a specific BLAS library

you can do this easily by setting packages.yaml, or from the CLI (a bit more involved).
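For example, a packages.yaml entry along these lines (a sketch assuming OpenBLAS is the desired provider; it mirrors the providers section shown later in this thread) pins the BLAS/LAPACK implementation for the whole DAG:

packages:
    all:
        providers:
            blas: [openblas]
            lapack: [openblas]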

depends_on('scalapack')

depends_on('elemental +openmp_blas +scalapack')
depends_on('elemental +openmp_blas +scalapack +int64', when='+int64')
Member

these three are not needed, see comment above.

Member

spack install lbann ^elemental +int64 might suffice, but it may make sense to expose that as a single int64 option on lbann. I think which you choose really depends on whether building with int64 elemental changes LBANN's API. If it doesn't, I'd keep the options exclusively on elemental. If it does, I'd expose it so that other packages that need to can depend on lbann with the int64 API.
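A sketch of the second option (exposing an int64 variant on lbann and forwarding it to elemental), reusing the spellings that already appear in this diff; the variant name and description here are illustrative:

    variant('int64', default=False,
            description='Use 64-bit integers, needed for indexing large matrices')

    depends_on('elemental +openmp_blas +scalapack')
    depends_on('elemental +openmp_blas +scalapack +int64', when='+int64')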

Member

@tgamblin This gets into the variant forwarding discussion.

If we let this case slide, then why not add +pic to every single package in Spack that depends on zlib? Obviously you wouldn't want that, and you can just do `hdf5 ^zlib+pic` if you need to.

Another reason to avoid this is that variant forwarding is currently broken (cough cough).

Maybe it would be better to allow variants to appear anywhere on the command line, so that foo +mpi means activate the mpi variant for every package in the spec that has one.

Member

OT:

foo +mpi means activate the mpi variant for every package in the spec that has one.

that may not be appropriate if mpi-parallel code wants to use a serial version (non-mpi) of another package.

Luckily @bvanessen adjusted the package so we don't have to debate on this topic here 😄

    variant('int64_blas', default=False,
            description='Elemental: use 64bit integers for BLAS')

    depends_on('cmake', type='build')
Member

This line isn't necessary. It's already in the CMakePackage base class.


@davydden davydden left a comment


LGTM

depends_on('protobuf@3.0.2')

    def build_type(self):
        """Returns the correct value for the ``CMAKE_BUILD_TYPE`` variable."""
Member

it's documented in the base class; for brevity, you may remove the documentation here.
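That is, roughly (a sketch that keeps only the logic already shown further down in this diff):

    def build_type(self):
        if '+debug' in self.spec:
            return 'Debug'
        else:
            return 'Release'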

@bvanessen
Contributor Author

bvanessen commented Feb 7, 2017 via email

Added support for the int64 data type in the Elemental library, which is required for supporting indices for large matrices.

Added a variant to force a sequential weight matrix initialization. This is slow, but provides an initialization that is independent of model parallelism.
@davydden
Member

davydden commented Feb 8, 2017

btw, i think

    @property
    def elemental_libs(self):
        shared = True if '+shared' in self.spec else False
        return find_libraries(
            ['libEl'], root=self.prefix, shared=shared, recurse=True
        )

would either not be needed after #1875 is merged (you would just need to set a custom name by adding `libs = libEl` to the package), or would need to be renamed into

    @property
    def libs(self):

@citibeth
Member

citibeth commented Feb 8, 2017

would need to be renamed into..

With or without #1875, I would also simplify the code:

    @property
    def elemental_libs(self):
        return find_libraries(
            ['libEl'], recurse=True, root=self.prefix,
            shared=('+shared' in self.spec))

@davydden
Member

davydden commented Feb 8, 2017

@bvanessen i would probably rename it straightaway to libs and add @citibeth's suggestion so that it's all good from the beginning.
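Combining the two suggestions, the property might end up looking roughly like this (a sketch, assuming the `libs` convention from #1875):

    @property
    def libs(self):
        return find_libraries(
            ['libEl'], root=self.prefix,
            shared=('+shared' in self.spec), recurse=True)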


@citibeth citibeth left a comment


  1. Please remove build_type(); it isn't used.
  2. Please remove the debug variant; it isn't used either. If you want elemental to be +debug, set that in packages.yaml.

    version('0.91', '83b0ec9cd0b7625d41dfb06d2abd4134')

    variant('debug', default=False,
            description='Builds a debug version of the libraries')
Member

The debug variant isn't used; can it be deleted? Soon, CMakePackage will do this automatically.

        if '+debug' in self.spec:
            return 'Debug'
        else:
            return 'Release'
Member

build_type() isn't used.

Member

It's part of CMakePackage. It sets the build type for the package.
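Conceptually, the base class folds that return value into the CMake invocation, roughly like this (an illustrative sketch, not Spack's actual internals; the helper name is hypothetical):

    def cmake_build_type_flag(self):
        # Hypothetical helper: CMakePackage passes the chosen build type
        # to CMake via the standard CMAKE_BUILD_TYPE variable.
        return '-DCMAKE_BUILD_TYPE:STRING={0}'.format(self.build_type())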

@bvanessen
Contributor Author

bvanessen commented Feb 10, 2017 via email

@citibeth
Member

Yes, and as a result it triggers canker to build without optimization and with debugging symbols.

What is canker? I don't see it mentioned anywhere.

@bvanessen
Contributor Author

bvanessen commented Feb 10, 2017 via email

@citibeth
Member

citibeth commented Feb 10, 2017 via email

@bvanessen
Contributor Author

bvanessen commented Feb 10, 2017 via email

@citibeth
Member

OK thanks, I get it. build_type() is used by CMakePackage; so it is still used, even though it's not called explicitly.

Hopefully we will soon get rid of the need to do this for CMakePackage, and automatically add debug to all CMakePackage packages (same idea as in #2380). Until that time, I suppose that build_type() is a necessary evil.

I'm still requesting that forwarding of the debug variant be removed. I.e., we should change this:

    depends_on('elemental +openmp_blas +scalapack +shared +int64')
    depends_on('elemental +openmp_blas +scalapack +shared +int64 +debug', when='+debug')

to this:

    depends_on('elemental +openmp_blas +scalapack +shared +int64')

Reasons:

  1. Variant forwarding like this is ad-hoc. With 1000+ packages, we need something more systematic. If we have variant forwarding randomly scattered through the repo, it will start to cause unexpected consequences as users combine packages in ways we did not anticipate, and the variants will not work the way users expect.

  2. This is an issue we've discussed a lot. See "Variant Forwarding Proposal" (#2594), "enhancement proposal: variant forwarding" (#391), "Re-work shared vs. static builds (Universal Variants)" (#2492), and "Allow setting default variants" (#2644). Currently, variant forwarding is used only in a few cases, I believe involving virtual dependencies. It is not generally used to forward standard variants to standard packages.

  3. You can already do what you need, with the current infrastructure, in one of two ways:

    a) Add to your packages.yaml:

packages:
    lbann:
        variants: [+debug]
    elemental:
        variants: [+debug]

    b) Set the `debug` variant everywhere it's possible, with:

packages:
    all:
        variants: [+debug]


@citibeth citibeth left a comment


Please remove the line:

depends_on('elemental +openmp_blas +scalapack +shared +int64 +debug', when='+debug')

This is not a case where variant forwarding should be used.


@bvanessen
Contributor Author

bvanessen commented Feb 10, 2017 via email

@citibeth
Member

citibeth commented Feb 10, 2017 via email

@bvanessen
Contributor Author

bvanessen commented Feb 10, 2017 via email

@citibeth
Member

Thank you for the explanation. My final conclusion is that the variant forwarding should be removed, for the reasons already cited above. The case of LBANN/Elemental seems no different from that of many other packages, where one needs to set +debug simultaneously on multiple packages. Spack maintainers might feel free to disagree, of course.

Please remember that Spack is evolving rapidly; we are thinking not just about how things are now, but about how today's PRs fit into tomorrow's planned work. We are planning on implementing #2380 (for shared/static and debug/release) as soon as #2386 is merged.

We've seen attempts to forward debug flags in the past. Some people try forwarding top-down as you did; others want to forward bottom-up (i.e. LBANN would not have a +debug variant, but would build debug or not based on the setting of Elemental's +debug variant). Adding top-down and bottom-up forwarding of +debug at random points in the DAG can only serve to gum up the works. Eventually, someone would have to spend a few hours sorting it out, probably removing all variant forwarding of this type.

With future PRs in mind, I would welcome any suggestions you might have on how to most conveniently set up debug builds in a general way. Right now, we have ways to set variants either for a single package or for all packages in a DAG. What is the best way to conveniently configure the debug variant for SOME packages in a DAG; for example, for the packages YOU are interested in building debug? I have no idea, but I'm open to suggestions.

In the meantime, asking LBANN users to set two +debug flags in packages.yaml instead of one does not seem like a very large imposition. Or they can set +debug for all packages, which requires only one setting.

I thought that the purpose of the spack package was to assemble all of the dependencies in a coherent way so that the end user does not have to micromanage an installation.

Spack is great, but it's not magic. Before setting up an install, I typically scrutinize the output of spack spec -Il, checking that versions and variants are what I intend. For my project requiring ~100 packages, I've found I need to specify variants or versions on maybe 10% of them (see below). This is a far cry from micromanaging each one of 100 packages.

You might find our work on Spack Environments to be relevant. A Spack Environment might be what you will want to distribute to your users: one that sets up for debug builds of your project, and one that sets up for production builds. Our notes so far are here:
https://github.com/LLNL/spack/wiki/Spack-Environments-WG


I'm sharing below my packages.yaml, where I've accumulated a number of version or variant specifications.

packages:
    # --------- Base Libraries
    parallel-netcdf:
        variants: [~cxx]
    eigen:
        # See http://eigen.tuxfamily.org/bz/show_bug.cgi?id=1370
        version: [3.2.10]
        variants: [~suitesparse]
    netcdf:
        version: [4.4.0]
        variants: [+mpi]
    # Required for NetCDF 4.4.0
    # See https://github.com/LLNL/spack/issues/3056
    hdf5:
        version: [1.8.18]
    openmpi:
        version: [1.10.3]

    # --------- Our Stuff
    modele-utils:
        version: [cmake]
    ibmisc:
        variants: [+everytrace]
    icebin:
        variants: [+gridgen,+python,+everytrace]
    pism:
        # Enables IceBin support code in PISM; no additional dependencies
        variants: [+icebin,+everytrace]
    modele:
        variants: [+pnetcdf,+everytrace]

    # -------- Python System
    python:
        version: [3.5.2]
    py-cython:
        version: [0.23.5]
    py-proj:
        version: [1.9.5.1.1]    # Normal released version 1.9.5.1 is buggy
    py-matplotlib:
        variants: [+gui,+ipython]
    py-numpy:
        variants: [+blas,+lapack]
    py-pyside:
        version: [1.2.4]

    # --------- Compiler & Virtual Dependencies
    all:
        compiler: [gcc@4.9.3]
        providers:
            mpi: [openmpi]
            blas: [openblas]
            lapack: [openblas]

I have another packages.yaml that I overlay on top of this one, which switches key packages in my DAG to their develop versions and adds +debug flags where needed:

packages:
    icebin:
        version: [develop]

    ibmisc:
        version: [develop]

    pism:
        version: [dev]

    modele:
        version: [landice]
        variants: [+debug,+mods]

    py-giss:
        version: [develop]

    modele-control:
        version: [develop]

    everytrace:
        version: [develop]

I also have one that re-uses key system-installed packages:

# Use system version of certain packages on CentOS 7
#
# Try:
#     yum list installed | grep <package>
#     repoquery -l <package> | grep <expected filename>

packages:
    # Problems in the past installing Python2
    # See: https://github.com/LLNL/spack/issues/1100
    # yum: python
    python:
        paths:
            python@2.7.5: /usr
        version: [2.7.5]


    # Recommended for security reasons
    # Do not install OpenSSL as non-root user.
    # yum: openssl-devel
    openssl:
        paths:
            openssl@1.0.1e: /usr
        version: [1.0.1e]
        buildable: False

    # Recommended, unless your system doesn't provide Qt4
    # yum: qt
    qt:
        paths:
            qt@4.8.5: /usr
        version: [4.8.5]
        buildable: False

    # Recommended, unless your system doesn't provide X11 libs
    # yum: libXaw-devel
    libxaw:
        paths:
            libxaw@1.0.12: /usr
        version: [1.0.12]
        buildable: False


@becker33 becker33 left a comment


@citibeth I appreciate your concern about variant forwarding. I think it's something that we should approach carefully, and should reconsider after the new concretizer that Todd is working on is in place. However, I think this is a compelling case for using forwarding. The semantics of it make sense -- lbann with debugging is useless when elemental does not have debug symbols.

I think this PR should go ahead as is. I will merge it when the tests return success, assuming no other change is made.

@tgamblin
Member

tgamblin commented Feb 11, 2017

I think all of @citibeth's advice is useful, but I'm with @becker33 on similar grounds to our decision on conduit in #2670. This package isn't widely used, and we're trying to give the actual maintainers of LBANN some leeway here to get them into Spack and to best serve their users. I would really like to address making variants more sane after I get the concretizer done.

@citibeth
Member

Please see #3131 for additional comments I have on this issue.

@citibeth citibeth mentioned this pull request Feb 12, 2017
@becker33 becker33 merged commit 56952aa into spack:develop Feb 13, 2017
diaena pushed a commit to diaena/spack that referenced this pull request May 26, 2017
* Creating a spack package for LLNL's LBANN (Livermore Big
Artificial Neural Network) training toolkit.

* Recipe for building LBANN toolkit.  Contains limited feature set and
is optimized for building with GNU gcc and OpenBLAS.

* Removed unnecessary dependencies based on reviewers feedback.

* Added support for the int64 data type in the Elemental library.  This
is required for supporting indices for large matrices.

* Added a variant to force a sequential weight matrix initialization.
This is slow, but provides an initialization that is independent of
model parallelism.

* Added a guard to prevent building Elemental with the Intel compiler
for versions that have known bugs.
xavierandrade pushed a commit to xavierandrade/spack that referenced this pull request Jun 16, 2017
amklinv pushed a commit that referenced this pull request Jul 17, 2017
healther pushed a commit to electronicvisions/spack that referenced this pull request Jul 26, 2017