LBANN #3049
class Lbann(CMakePackage):
    """LBANN: Livermore Big Artificial Neural Network Toolkit. A distributed memory, HPC-optimized, model and data parallel training toolkit for deep neural networks."""
some new lines are needed to fit 80char limit.
    variant('shared', default=True,
            description='Enables the build of shared libraries')
    variant('int64', default=False,
I don't think those belong in this package at all. You should be able to do `spack install lbann ^elemental~int64` and the like, if you need to control it.
    depends_on('cmake', type='build')

    # Currently required to allow lbann to specify a specific BLAS library
    depends_on('blas')
This package does not use `blas`/`lapack`; please remove.
> specify a specific BLAS library
You can do this easily by setting `packages.yaml`, or from the CLI (a bit more involved).
    depends_on('scalapack')

    depends_on('elemental +openmp_blas +scalapack')
    depends_on('elemental +openmp_blas +scalapack +int64', when='+int64')
these three are not needed, see comment above.
`spack install lbann ^elemental +int64` might suffice, but it may make sense to expose that as a single `int64` option on `lbann`. I think which you choose really depends on whether building with `int64` `elemental` changes LBANN's API. If it doesn't, I'd keep the options exclusively on `elemental`. If it does, I'd expose it so that other packages that need to can depend on `lbann` with the `int64` API.
@tgamblin This gets into the variant forwarding discussion.
If we let this case slide, then why not add `+pic` to every single package in Spack that depends on `zlib`? Obviously you wouldn't want that, and you can just do `hdf5 ^zlib+pic` if you needed to.
Another reason to avoid this is that variant forwarding is currently broken (cough cough).
Maybe it would be better to allow variants to appear anywhere on the command line, so that `foo +mpi` means activate the `mpi` variant for every package in the spec that has one.
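The "activate the variant for every package in the spec that has one" idea can be sketched as a toy model. This is not Spack code: the `DECLARED_VARIANTS` table and the `apply_everywhere` helper are hypothetical names invented for illustration, and the variant sets are made up rather than taken from the real package files.

```python
# Toy model only: maps package names to the variant names each one declares.
# The data is illustrative and not read from Spack's real packages.
DECLARED_VARIANTS = {
    "foo": {"mpi", "shared"},
    "hdf5": {"mpi", "shared"},
    "zlib": {"pic", "shared"},
}


def apply_everywhere(dag, variant):
    """Return '+variant' for every package in the DAG that declares it."""
    return {
        pkg: "+" + variant
        for pkg in dag
        if variant in DECLARED_VARIANTS[pkg]
    }


# 'foo +mpi' under the proposed semantics: zlib is skipped because it
# declares no 'mpi' variant in this toy table.
enabled = apply_everywhere(["foo", "hdf5", "zlib"], "mpi")
print(enabled)
```

Note that the sketch enables the variant on every package that declares it, with no way to opt a single dependency out.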
OT:
> foo +mpi means activate the mpi variant for every package in the spec that has one.
That may not be appropriate if `mpi`-parallel code wants to use a serial (non-`mpi`) version of another package.
Luckily @bvanessen adjusted the package so we don't have to debate this topic here 😄
    variant('int64_blas', default=False,
            description='Elemental: use 64bit integers for BLAS')

    depends_on('cmake', type='build')
This line isn't necessary. It's already in the CMakePackage base class.
LGTM
    depends_on('protobuf@3.0.2')

    def build_type(self):
        """Returns the correct value for the ``CMAKE_BUILD_TYPE`` variable
it's documented in the base class, for brevity you may remove documentation
Please note that this package also depends on the other pull request that I submitted to add a new version for protobuf #3046.
btw, i think
after #1875 is merged
With or without #1875, I would also simplify the code:
@bvanessen i would probably rename it straightaway to
- Please remove `build_type()`, it isn't used.
- Please remove the `debug` variant, it isn't used either. If you want `elemental` to be `+debug`, set that in `packages.yaml`.
    version('0.91', '83b0ec9cd0b7625d41dfb06d2abd4134')

    variant('debug', default=False,
            description='Builds a debug version of the libraries')
The `debug` variant isn't used, can it be deleted? Soon, `CMakePackage` will do this automatically.
        if '+debug' in self.spec:
            return 'Debug'
        else:
            return 'Release'
`build_type()` isn't used.
It's part of `CMakePackage`. It sets the build type for the package.
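A rough illustration of why an overridden hook can look unused in the package file: the base class calls it. The classes below are a toy model, not Spack's real `CMakePackage`; the names `ToyCMakePackage` and `ToyLbann`, the `variants` set, and the placeholder default build type are all assumptions made for this sketch.

```python
# Toy model only: these classes are NOT Spack's real CMakePackage.
class ToyCMakePackage:
    """Hypothetical base class that asks subclasses for a build type."""

    def __init__(self, variants):
        self.variants = set(variants)  # e.g. {"+debug", "+opencv"}

    def build_type(self):
        # Placeholder default used when a subclass does not override the hook.
        return "RelWithDebInfo"

    def cmake_args(self):
        # The base class calls the hook; the subclass never calls it itself.
        return ["-DCMAKE_BUILD_TYPE=" + self.build_type()]


class ToyLbann(ToyCMakePackage):
    def build_type(self):
        # Same branch as in the diff above, expressed on the toy variant set.
        return "Debug" if "+debug" in self.variants else "Release"


debug_args = ToyLbann({"+debug"}).cmake_args()
release_args = ToyLbann(set()).cmake_args()
print(debug_args, release_args)
```

In this sketch nothing in `ToyLbann` ever calls `build_type()`, yet removing the override would change the arguments that `cmake_args()` produces.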
Yes, and as a result it triggers canker to build without optimization and with debugging symbols.
What is canker? I don't see it mentioned anywhere.
cmake. Bad auto correct.
On Fri, Feb 10, 2017 at 2:27 AM, Brian Van Essen wrote:
> Yes, and as a result it triggers canker to build without optimization and with debugging symbols.
Which package are you talking about? I'm trying to understand the problem, because I suspect there's a more Spack-compatible way to do this.
When the cmake build environment for LBANN is set to debug, the LBANN build will change its cpp flags to disable optimizations and build with symbols. So, while it does not have an explicit impact in the spack package, it does change how the code is built.
Please remove the line:
depends_on('elemental +openmp_blas +scalapack +shared +int64 +debug', when='+debug')
This is not a case where variant forwarding should be used.
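The effect of the conditional `depends_on(..., when='+debug')` line under discussion can be modelled with a small stand-alone function. This is only a sketch of the behaviour being debated, not Spack's dependency resolution; `elemental_spec` and its `forward_debug` flag are hypothetical names.

```python
# Toy model of the two depends_on() lines; not Spack's concretizer.
BASE = "elemental +openmp_blas +scalapack +shared +int64"


def elemental_spec(lbann_variants, forward_debug=True):
    """Return the elemental constraint implied by lbann's variants.

    forward_debug=True mimics keeping the when='+debug' line;
    forward_debug=False mimics removing it as the reviewer requests.
    """
    if forward_debug and "+debug" in lbann_variants:
        return BASE + " +debug"
    return BASE


with_forwarding = elemental_spec({"+debug"})
without_forwarding = elemental_spec({"+debug"}, forward_debug=False)
print(with_forwarding)
print(without_forwarding)
```

With forwarding removed, a debug `lbann` leaves the `debug` setting of `elemental` to whatever the user or site configuration requests.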
> Please remove the line:
> depends_on('elemental +openmp_blas +scalapack +shared +int64 +debug', when='+debug')
> This is not a case where variant forwarding should be used.
Building lbann with debugging and elemental without debugging creates a nonsensical binary, and relying on user-based preferences seems like a recipe for inconsistent environments. Until variant forwarding is properly implemented, I would prefer to make sure that elemental is correctly built in a debug environment.
On Friday, February 10, 2017 at 8:59 AM, Elizabeth Fischer wrote:
OK thanks, I get it. `build_type()` is used by `CMakePackage`, so it is still used even though it's not called explicitly.
Hopefully we will soon get rid of the need to do this for `CMakePackage`, and automatically add `debug` to all `CMakePackage` packages (same idea as in #2380). Until that time, I suppose that `build_type()` is a necessary evil.
I'm still requesting that forwarding of the `debug` variant be removed. I.e. we should change this:
    depends_on('elemental +openmp_blas +scalapack +shared +int64')
    depends_on('elemental +openmp_blas +scalapack +shared +int64 +debug', when='+debug')
to this:
    depends_on('elemental +openmp_blas +scalapack +shared +int64')
Reasons:
1. Variant forwarding like this is ad-hoc. With 1000+ packages, we need something more systematic. If we have variant forwarding randomly scattered through the repo, that will start to cause unusual/unexpected consequences as users combine packages in unexpected ways, and the variants don't work the way they expect.
2. This is an issue we've discussed a lot. See #2594 #391 #2492 #2644. Currently, variant forwarding is used only in a few cases, I believe involving virtual dependencies. It is not generally used to forward standard variants to standard packages.
3. You can already do what you need, with the current infrastructure, in one of two ways:
   a) Add to your packages.yaml:
        packages:
          lbann:
            variants: [+debug]
          elemental:
            variants: [+debug]
   b) Set the `debug` variant everywhere it's possible, with:
        packages:
          all:
            variants: [+debug]
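The difference between options a) and b) can be sketched with plain dictionaries. This is a toy lookup written for illustration, assuming a package-specific entry takes priority over `all`; the helper `preferred_variants` is hypothetical, and the real concretizer's handling of preferences is more involved than this.

```python
# Toy model of the two packages.yaml layouts above, as plain dicts.
OPTION_A = {"lbann": ["+debug"], "elemental": ["+debug"]}
OPTION_B = {"all": ["+debug"]}


def preferred_variants(prefs, package):
    """Package-specific entry if present, otherwise the 'all' entry."""
    return prefs.get(package, prefs.get("all", []))


packages = ("lbann", "elemental", "opencv")
option_a = {pkg: preferred_variants(OPTION_A, pkg) for pkg in packages}
option_b = {pkg: preferred_variants(OPTION_B, pkg) for pkg in packages}
print(option_a)
print(option_b)
```

In this sketch option a) leaves `opencv` untouched, while option b) requests `+debug` for every package in the list.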
On Fri, Feb 10, 2017 at 12:32 PM, Brian Van Essen wrote:
> Building lbann with debugging and elemental without debugging creates a nonsensical binary
What do you mean by nonsensical? I use plenty of binaries where some of the packages were compiled with `-O2 -g` and others were just compiled with `-O2`, and I'm able to trace stack traces through them all.
In any case... if the binary works, then it is conceivably something someone might want to do, and I think the variant forwarding should be removed. If it simply does not work, then I agree, the variant should be forwarded.
LBANN relies quite heavily on Elemental. I have yet to encounter a bug in LBANN where I didn't want to have debugging turned on in Elemental. Specifically, many of the functional errors that occur in LBANN are related to mismatched matrix sizes, and those are not reported if Elemental is not built with debug flags enabled. I am worried about the ease of debugging LBANN in spack: if the default behavior for building a debugging version of LBANN does not force Elemental to have the proper debugging flags, then it will be harder for users to actually debug a problem. I thought that the purpose of the spack package was to assemble all of the dependencies in a coherent way so that the end user does not have to micromanage an installation.
Thank you for the explanation. My final conclusion is that the variant forwarding should be removed, for reasons already cited above. The case of LBANN/Elemental seems no different from that of many other packages, where one needs to set
Please remember that Spack is evolving rapidly; we are thinking not just about how things are now, but about how today's PRs fit into tomorrow's planned work. We are planning on implementing #2380 (for shared/static and debug/release) as soon as #2386 is merged. We've seen attempts to forward debug flags in the past. Some people try forwarding top-down as you did; others want to forward bottom-up (i.e. LBANN would not have a
With future PRs in mind, I would welcome suggestions you might have on how to most conveniently set up debug builds in a general way. Right now, we have ways to set variants either for a single package, or for all packages in a DAG. What is the best way to conveniently configure the debug variant for SOME packages in a DAG; for example, for the packages YOU are interested in building debug? I have no idea, but I'm open to suggestions. In the meantime, asking LBANN users to set two
Spack is great, but it's not magic. Before setting up an install, I typically scrutinize the output of
You might find our work on Spack Environments to be relevant. A Spack Environment might be what you will want to distribute to your users: one that sets up for debug builds of your project, and one that sets up for production builds. Our notes so far are here:
I'm sharing below my
I have another
I also have one that re-uses key system-installed packages:
@citibeth I appreciate your concern about variant forwarding. I think it's something that we should approach carefully, and should reconsider after the new concretizer that Todd is working on is in place. However, I think this is a compelling case for using forwarding. The semantics of it make sense -- lbann with debugging is useless when elemental does not have debug symbols.
I think this PR should go ahead as is. I will merge it when the tests return success, assuming no other change is made.
I think all of @citibeth's advice is useful but I'm with @becker33 on similar grounds to our decision on conduit in #2670. This package isn't widely used, and we're trying to give the actual maintainers of LBANN some leeway here to get them into Spack and to best serve their users. I would really like to address making variants more sane after
Please see #3131 for additional comments I have on this issue.
* Creating a spack package for LLNL's LBANN (Livermore Big Artificial Neural Network) training toolkit.
* Recipe for building LBANN toolkit. Contains limited feature set and is optimized for building with GNU gcc and OpenBLAS.
* Removed unnecessary dependencies based on reviewers feedback.
* Added support for the int64 data type in the Elemental library. This is required for supporting indices for large matrices.
* Added a variant to force a sequential weight matrix initialization. This is slow, but provides an initialization that is independent of model parallelism.
* Added a guard to prevent building Elemental with the Intel compiler for versions that have known bugs.
Recipe for building Livermore Big Artificial Neural Network (LBANN) training toolkit.