install fails looking for modulecmd #2924

Closed
tgamblin opened this issue Jan 24, 2017 · 20 comments

Comments

@tgamblin
Member

A user reported this error.

It looks like load_module in build_environment.py will fail if modulecmd is not available.

@alalazo: Is it supposed to be possible to get to this code if the system supports modules? Seems like we should check for modulecmd up front. I haven't dug into this yet.

galaxy:~>spack -d -v mirror list                                                                      
==> Reading config file /local/cjn/.spack/linux/mirrors.yaml                                         
local_filesystem    file:///ROWAN/group/pts/spack_mirror/spack-mirror-2017-01-24

galaxy:~>spack -d -v install zlib
==> Reading config file /ROWAN/group/prod/LINUX_WORKSTATION/acs/spack/spack-0.10.0/etc/spack/defaults/packages.yaml
==> READ LOCK: /local/cjn/.spack/cache/providers/.builtin-index.yaml.lock[0:0] [Acquiring]
==> READ LOCK: /local/cjn/.spack/cache/providers/.builtin-index.yaml.lock[0:0] [Released]
==> Reading config file /local/cjn/.spack/linux/compilers.yaml
==> READ LOCK: /ROWAN/group/prod/LINUX_WORKSTATION/acs/spack/spack-0.10.0/opt/spack/.spack-db/prefix_lock[3332486004197984765:1] [Acquiring]
==> READ LOCK: /ROWAN/group/prod/LINUX_WORKSTATION/acs/spack/spack-0.10.0/opt/spack/.spack-db/prefix_lock[3332486004197984765:1] [Released]
==> Installing zlib
==> Error: AttributeError: 'NoneType' object has no attribute 'add_default_arg'
/ROWAN/group/prod/LINUX_WORKSTATION/acs/spack/spack-0.10.0/lib/spack/spack/build_environment.py:567, in child_execution:
     547      def child_execution(child_connection, input_stream):
     548          try:
     549              setup_package(pkg, dirty=dirty)
     550              function(input_stream)
     551              child_connection.send(None)
     552          except StopIteration as e:
     553              # StopIteration is used to stop installations
     554              # before the final stage, mainly for debug purposes
     555              tty.msg(e.message)
     556              child_connection.send(None)
     557          except:
     558              # catch ANYTHING that goes wrong in the child process
     559              exc_type, exc, tb = sys.exc_info()
     560  
     561              # Need to unwind the traceback in the child because traceback
     562              # objects can't be sent to the parent.
     563              tb_string = traceback.format_exc()
     564  
     565              # build up some context from the offending package so we can
     566              # show that, too.
  >> 567              package_context = get_package_context(tb)
     568  
     569              build_log = None
     570              if hasattr(pkg, 'log_path'):
     571                  build_log = pkg.log_path
     572  
     573              # make a pickleable exception to send to parent.
     574              msg = "%s: %s" % (str(exc_type.__name__), str(exc))
     575  
     576              ce = ChildError(msg, tb_string, build_log, package_context)
     577              child_connection.send(ce)
     578  
     579          finally:
     580              child_connection.close()


Traceback (most recent call last):
  File "/ROWAN/group/prod/LINUX_WORKSTATION/acs/spack/spack-0.10.0/lib/spack/spack/build_environment.py", line 549, in child_execution
    setup_package(pkg, dirty=dirty)
  File "/ROWAN/group/prod/LINUX_WORKSTATION/acs/spack/spack-0.10.0/lib/spack/spack/build_environment.py", line 492, in setup_package
    set_compiler_environment_variables(pkg, spack_env)
  File "/ROWAN/group/prod/LINUX_WORKSTATION/acs/spack/spack-0.10.0/lib/spack/spack/build_environment.py", line 227, in set_compiler_environment_variables
    load_module(mod)
  File "/ROWAN/group/prod/LINUX_WORKSTATION/acs/spack/spack-0.10.0/lib/spack/spack/build_environment.py", line 129, in load_module
    modulecmd.add_default_arg('python')
AttributeError: 'NoneType' object has no attribute 'add_default_arg'
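
The traceback shows the lookup for modulecmd returning None and the failure only surfacing when a method is called on it. A minimal sketch of an up-front check, assuming Python 3's shutil.which as a stand-in for Spack's own which helper; find_modulecmd and ModuleCommandNotFound are hypothetical names, not part of Spack:

```python
import shutil


class ModuleCommandNotFound(RuntimeError):
    """Hypothetical error type used only in this sketch."""


def find_modulecmd(path=None):
    """Return the full path to modulecmd, or raise a descriptive error.

    `path` is an optional PATH-style string; None means "use the
    process environment", which is shutil.which's default behaviour.
    """
    exe = shutil.which("modulecmd", path=path)
    if exe is None:
        # Fail here with a readable message instead of letting a later
        # attribute access on None raise an AttributeError.
        raise ModuleCommandNotFound(
            "modulecmd was not found in PATH, but a compiler or external "
            "package is configured to load a module"
        )
    return exe
```

With a check like this, the user would see which tool is missing rather than the AttributeError above.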
@alalazo
Member

alalazo commented Jan 24, 2017

@tgamblin I'll try to have a look. If I am not wrong this code was part of the support for Cray, and is reached either if a compiler has a module specified or if an external has a module specified.

@tgamblin
Member Author

Maybe I should be pinging @becker33 and @mamelara then.

@mpbelhorn: I seem to recall you having issues where modulecmd wasn't available in the path, even when a system had modules. Could this be related?

@mpbelhorn
Contributor

I think @alalazo is right about when load_module is hit. As @tgamblin mentioned we don't put modulecmd in the user's PATH by default (unlike at NERSC). However, to use spack on our Crays we do need to add modulecmd to the PATH.

We also have a non-Cray system that uses lmod, for which we need to declare modules in both compilers.yaml and packages.yaml. On that system, we symlink modulecmd->lmod and also make sure this symlink is in the PATH. Doing that introduces other issues, which I hope to fix properly at some point in the future by generalizing support for different module systems. Currently we use a dirty fix to let the symlinked modulecmd->lmod work as expected.

@DaanVanVugt

I am also affected by this bug (on a CentOS 7 system, module command = module)
Wouldn't it be smart to make the modulecmd a configuration option?

@becker33
Member

becker33 commented Feb 1, 2017

@mpbelhorn I thought that lmod calls modulecmd under the hood, is that not correct?

@Exteris Does your site's implementation of modules not call modulecmd as part of its execution? Or is the issue that modulecmd isn't in the user's path so Spack can't find it?

@tgamblin @alalazo You are correct about the origin of this piece of code. It should only be called when we already know we need modules, but I may have to think more about how to find modulecmd when it's not in the user's path.

@DaanVanVugt

@becker33 module() is defined in the user's environment as a shell function:

module () 
    eval `$TCLSH /marconi/prod/opt/environment/module/3.2.10/none/modulecmd.tcl bash $*`

Modules Release Tcl 3.1.6 ($RCSfile: modulecmd.tcl,v $ $Revision: 1.112 $)
	Copyright GNU GPL v2 1991

modulecmd.tcl is not in my path, and when I make a symlink modulecmd -> /marconi/..../modulecmd.tcl I get the error

==> Error: SyntaxError: invalid syntax (<string>, line 1)
/marconi/home/userexternal/dvanvugt/spack/lib/spack/spack/build_environment.py:567, in child_execution:

when using external openmpi.

@DaanVanVugt

I am currently trying to create a small wrapper for my module system.
modulecmd.tcl bash $* returns: . /tmp/modulescript_44581_0
If I output the contents of this file (using a modulecmd wrapper like the one below):

#!/bin/bash
file=`$TCLSH /marconi/prod/opt/environment/module/3.2.10/none/modulecmd.tcl bash $* | tr -d '. '`
cat $file

output like this is produced:

/bin/rm -f /tmp/modulescript_44581_0
LD_LIBRARY_PATH="/cineca/prod/opt/compilers/intel/pe-xe-2017/binary/itac_2017/lib:/cineca/prod/opt/compilers/intel/pe-xe-2017/binary/inspector_2017/lib64:/cineca/prod/opt/compilers/intel/pe-xe-2017/binary/lib/intel64:/cineca/prod/opt/compilers/openmpi/1-10.3/gnu--6.1.0/lib:/cineca/prod/opt/compilers/gnu/6.1.0/none/lib64"; export LD_LIBRARY_PATH

and spack hangs on spack install.
What kind of output does spack expect here?

@luigi-calori
Contributor

Hi @Exteris ... welcome to Marconi. I think I had the same problem on another CINECA cluster
(I had a Spack instance installing on Marconi, but did not hit the problem as I used GNU compilers).
I think this happens because the module system installed at CINECA is pure Tcl, so modulecmd is not an executable but a Tcl script. So symlinking modulecmd to modulecmd.tcl is unlikely to work.

@DaanVanVugt

Hi @luigi-calori Thanks :)
I created a wrapper script which is executable, so I think modulecmd is executing okay.
Now spack spec scotch+esmumps+mpi ^openmpi waits forever (or at least a very long time)
with the relevant packages.yaml:

packages:
  openmpi:
    modules:
      openmpi@1.10.3%gcc@6.1.0: openmpi/1-10.3--gnu--6.1.0
    buildable: False

@luigi-calori
Contributor

Likely it is the which('modulecmd') that cannot work with this kind of pure Tcl modules...
It's unclear to me why you need to use openmpi/1-10.3--gnu--6.1.0 if you want to build with Intel.
As far as I understand, it would be good to use intelmpi with the Intel compiler.
I'm not at Cineca now, but I can ask my colleagues in user support for advice...
Did you try to compile with gcc? ... I think it should be possible to wrap already installed compilers even without loading modules; it could be enough to define the same environment variables the module defines.

@DaanVanVugt

Ah, I have made a mistake here. The wrapper only needs to be:

#!/bin/bash
$TCLSH /marconi/prod/opt/environment/module/3.2.10/none/modulecmd.tcl $*

which outputs:

modulecmd python show openmpi/1-10.3--gnu--6.1.0
-------------------------------------------------------------------
/cineca/prod/opt/modulefiles/base/compilers/openmpi/1-10.3--gnu--6.1.0:

prereq	gnu/6.1.0
conflict	openmpi
setenv	OPENMPI_HOME	/cineca/prod/opt/compilers/openmpi/1-10.3/gnu--6.1.0
setenv	MPICC	mpicc
setenv	MPICXX	mpicxx
setenv	MPIF90	mpif90
setenv	MPIF77	mpif77
setenv	MPIFC	mpifort
prepend-path	PATH	/cineca/prod/opt/compilers/openmpi/1-10.3/gnu--6.1.0/bin	:
prepend-path	LD_LIBRARY_PATH	/cineca/prod/opt/compilers/openmpi/1-10.3/gnu--6.1.0/lib	:
prepend-path	MANPATH	/cineca/prod/opt/compilers/openmpi/1-10.3/gnu--6.1.0/man	:
module-whatis	OpenMPI
-------------------------------------------------------------------

and yields a SyntaxError: invalid syntax (<string>, line 1)
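
For the earlier suggestion of defining the same environment variables the module defines, the show listing above already carries the needed information. A rough sketch that collects the setenv and prepend-path lines from such a listing; parse_module_show and apply_to_env are hypothetical helpers, and the parser assumes whitespace-separated fields with no spaces inside values, as in the output above:

```python
def parse_module_show(text):
    """Collect setenv and prepend-path entries from `modulecmd ... show` text.

    Lines with fewer than three fields (separators, the modulefile path,
    prereq, conflict, module-whatis) are ignored.
    """
    setenv = {}
    prepend = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        keyword, name, value = fields[0], fields[1], fields[2]
        if keyword == "setenv":
            setenv[name] = value
        elif keyword == "prepend-path":
            # A later prepend-path ends up in front of an earlier one.
            prepend.setdefault(name, []).insert(0, value)
    return setenv, prepend


def apply_to_env(env, setenv, prepend):
    """Return a copy of `env` with the parsed changes applied."""
    new_env = dict(env)
    new_env.update(setenv)
    for name, dirs in prepend.items():
        current = new_env.get(name)
        parts = list(dirs) + ([current] if current else [])
        new_env[name] = ":".join(parts)
    return new_env
```

This does not handle prereq or conflict, so it only approximates what loading the module would do.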

@DaanVanVugt

DaanVanVugt commented Feb 1, 2017

@luigi-calori I have put modulecmd in a folder in my path, so that works.
I'd like to compile both openmpi(%gcc) and intelmpi(%intel) versions of scotch, mumps, pastix, as we are having some trouble with intelmpi in pastix.

Mostly it is just me trying to use spack with marconi provided modules.
I think I've made some progress by removing the line with -------!

New script:

#!/bin/bash
$TCLSH /marconi/prod/opt/environment/module/3.2.10/none/modulecmd.tcl $* 2>&1 | tail -n +2 1>&2

@pramodk
Contributor

pramodk commented Feb 1, 2017

@Exteris: I saw the same error on Marconi (just getting started on this system). Once you have everything working, it would be very helpful if you could summarise the issues you encountered and the workarounds/fixes.

@luigi-calori
Contributor

Mm... interesting, let me know how it proceeds, as I'm also interested in building things with Spack on Cineca clusters, specifically Marconi.
I would really like Spack to become more widely used in module set-up...
If you could publish your packages.yaml as well as your compilers.yaml, that would be really good.

For gcc6 I've tried to use
https://github.com/RemoteConnectionManager/RCM_spack_deploy/blob/master/recipes/hosts/marconi/config/compilers.yaml

which does not use modules... the modulecmd wrapper is a smart idea... even if a hacky one

@DaanVanVugt

I've been able to install scotch just now, with the script in the post above named modulecmd and available in path.
I'll post my packages.yaml below.

packages:
  openmpi:
    modules:
      openmpi@1.10.3%gcc@6.1.0: openmpi/1-10.3--gnu--6.1.0
    buildable: False
  intelmpi:
    modules:
      intelmpi@2017.1.132%intel@17.0.1: intelmpi/2017--binary
    buildable: False
  cmake:
    modules:
      cmake@3.5.2: cmake/3.5.2
    buildable: False
  netlib-scalapack:
    modules:
      netlib-scalapack@2.0.2^intelmpi: scalapack/2.0.2--intelmpi--2017--binary
      netlib-scalapack@2.0.2^openmpi: scalapack/2.0.2--openmpi--1-10.3--intel--pe-xe-2017--binary
    buildable: False
  openblas:
    modules:
      openblas@3.6.0%gcc@6.1.0: blas/3.6.0--gnu--6.1.0
    buildable: False
  intel-mkl:
    modules:
      intel-mkl@2017.1.132%intel: mkl/2017--binary
    buildable: False
  bison:
    paths:
      bison@2.7: /usr/bin/bison
    buildable: False
  flex:
    paths:
      flex@2.5.37: /usr/bin/flex
    buildable: False
  zlib:
    modules:
      zlib@1.2.8%gcc@6.1.0: zlib/1.2.8--gnu--6.1.0
    buildable: False
  hwloc:
    modules:
      hwloc@1.11.3: hwloc/1.11.3--gnu--6.1.0
    buildable: False
  all:
    providers:
      mpi: [openmpi, intelmpi]

I'm running into some unrelated issues with the scotch package that I'll create a separate report for.

@luigi-calori
Contributor

Thanks, very useful info, keep us updated.
Regarding modulecmd wrapping, maybe a Spack guru could comment:
maybe it would be possible to add a test in
https://github.com/LLNL/spack/blob/develop/lib/spack/spack/build_environment.py#L129
after modulecmd = which('modulecmd') that, in case it is None, tries something like:
modulecmd = which('$TCLSH ${MODULESHOME}/modulecmd.tcl $*')

Which spec and compilers.yaml are you using? That way we could exercise an eventual PR under the same conditions as yours.

@DaanVanVugt

spack.yaml and compiler.yaml I have not altered.
I agree that your suggestion for modulecmd would be useful.
We'd need to check for presence of $TCLSH and ${MODULESHOME} (and call .... python $*)
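
The fallback discussed here could look roughly like the following. This is only a sketch: modulecmd_argv is a hypothetical helper, shutil.which stands in for Spack's which, and the layout ${MODULESHOME}/modulecmd.tcl is taken from the suggestion above rather than verified for every site:

```python
import os
import shutil


def modulecmd_argv(environ=None, path=None):
    """Return an argv prefix for calling modulecmd in python mode, or None.

    Prefers a real modulecmd executable; otherwise falls back to running
    modulecmd.tcl through $TCLSH when both $TCLSH and $MODULESHOME are set.
    """
    environ = os.environ if environ is None else environ

    exe = shutil.which("modulecmd", path=path)
    if exe:
        return [exe, "python"]

    tclsh = environ.get("TCLSH")
    modules_home = environ.get("MODULESHOME")
    if tclsh and modules_home:
        # Assumed location of the Tcl script inside $MODULESHOME.
        script = os.path.join(modules_home, "modulecmd.tcl")
        return [tclsh, script, "python"]

    return None
```

The returned list can be extended with a subcommand such as show or load and passed to subprocess. It does not cover sites where neither modulecmd nor these variables are visible.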

@becker33
Member

becker33 commented Feb 1, 2017

Capturing modulecmd is not quite as simple as @luigi-calori suggested, because sometimes modulecmd is not in the path on systems that use an executable instead of a tcl script. The solution needs to read the shell environment to determine what that system executes when a user types module <command>, which is what I'm working on now.

Unless anyone objects, I'm pursuing a solution that involves parsing typeset -f in a bash environment.

@Exteris your environment appears to be csh. Is bash present on your machines, and if so am I correct in assuming that the output for typeset -f includes the following?

module ()
{
    eval `$TCLSH /marconi/prod/opt/environment/module/3.2.10/none/modulecmd.tcl bash $*`
}
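
A sketch of what parsing typeset -f might involve, assuming the module function has the eval-with-backticks shape shown above and that the command ends with a shell name followed by $*. modulecmd_from_typeset is a hypothetical name, and this is not the implementation being worked on:

```python
import re

# Matches: eval `<command words>`
_EVAL_BACKTICKS = re.compile(r"eval\s+`([^`]+)`")


def modulecmd_from_typeset(typeset_output):
    """Extract the command behind the module() shell function.

    Returns the command words with the trailing shell name and the
    argument placeholder removed, or None if nothing suitable is found.
    Shell variables such as $TCLSH are returned unexpanded.
    """
    in_module = False
    for line in typeset_output.splitlines():
        stripped = line.strip()
        if re.match(r"module\s*\(\)", stripped):
            in_module = True
            continue
        if not in_module:
            continue
        if stripped == "}":
            break  # reached the end of module() without an eval line
        match = _EVAL_BACKTICKS.search(stripped)
        if match:
            words = [w for w in match.group(1).split()
                     if w not in ("$*", "$@", '"$@"')]
            # Assumption: the last remaining word is the shell name
            # (bash, sh, ...), which a caller would replace with "python".
            return words[:-1] or None
    return None
```

A caller would still need to expand variables like $TCLSH, for example with os.path.expandvars, before running the result.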

@cnelson3

cnelson3 commented Feb 1, 2017

Hi all, thanks for looking into this for me (I was the user who 'reported' it to Todd). I'm very new to Spack, so all this help and advice is most welcome.

@becker33 - we use bash, and yes, typeset -f includes a similar eval line:

module ()
{
    eval `/aaa/Modules/bin/modulecmd.tcl sh $*`
}

@alalazo
Member

alalazo commented Feb 1, 2017

@becker33 Just in case it may be useful and you were not aware of it: #2426
