feature : customization of modules from configuration files #744

Merged
merged 39 commits into from
May 11, 2016

alalazo
Member

@alalazo alalazo commented Apr 5, 2016

Modifications :
  • during module file generation all the dependencies can add variables to run_env (previously it was only extendee)
  • it is possible to filter modifications to specific environment variables out of module files
  • autoload and/or prereq in tcl module files
  • autoload and/or prereq in dotkit module files
  • it is possible to customize the environment modifications in modules
  • it is possible to conditionally blacklist modules
  • it is possible to use a naming scheme like the one in #498 (Modules in directories and fixes for setup-env.csh) and to issue conflicts in tcl modules
  • regression tests on all the bugs found by @glennpj

Example : filter modifications out of module files

Modifications to certain environment variables in module files are generated by default. Suppose you would like to avoid having CPATH and LIBRARY_PATH modified by your dotkit modules. Then a user modules.yaml like:

modules:
  dotkit:
    all:
      filter:
        environment_blacklist: ['CPATH', 'LIBRARY_PATH']  # Exclude changes to any of these variables

will generate dotkit module files that contain no modifications to CPATH or LIBRARY_PATH, while tcl module files will still contain those modifications.
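
As a rough illustration of what the filter does, the blacklist can be thought of as dropping every environment modification whose variable name is listed. This is only a sketch: the tuple layout and the function name `filter_blacklisted` are made up for the example and are not Spack's actual internals.

```python
from typing import Iterable, List, Tuple

# Hypothetical shape of one environment modification: (command, variable, value).
Modification = Tuple[str, str, str]


def filter_blacklisted(
    modifications: Iterable[Modification],
    environment_blacklist: Iterable[str],
) -> List[Modification]:
    """Drop every modification that targets a blacklisted variable."""
    blocked = set(environment_blacklist)
    return [m for m in modifications if m[1] not in blocked]


modifications = [
    ("prepend-path", "PATH", "/opt/zlib/bin"),
    ("prepend-path", "CPATH", "/opt/zlib/include"),
    ("prepend-path", "LIBRARY_PATH", "/opt/zlib/lib"),
]

# dotkit has a blacklist configured, tcl has none.
dotkit_entries = filter_blacklisted(modifications, ["CPATH", "LIBRARY_PATH"])
tcl_entries = filter_blacklisted(modifications, [])
```

Because the filter is configured per module type, the same list of modifications yields a shorter dotkit file and an unchanged tcl file.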

Example : autoload direct dependencies in tcl module files

The following lines in modules.yaml:

modules:
  tcl:
    all:
      autoload: 'direct'

will produce tcl module files that automatically load their direct dependencies. Prerequisites can be added in a similar way :

modules:
  tcl:
    all:
      prerequisites: 'direct'

Example : customize module file entries

It's possible to customize the entries written in module files. For instance with:

modules:
  tcl:
    all:
      environment:
        set: ['BAR,bar']
    ^openmpi:: # Note the double ':'
      environment:
        set: ['BAR,baz']
    zlib:
      environment:
        set: ['FOO,foo']
    zlib%gcc@4.8:
      environment:
        set: ['FOOBAR,foobar']

what will happen is :

  • every module will have a setenv BAR bar line
  • unless the module depends on openmpi, in which case it will have setenv BAR baz instead ('::' overrides previous rules, to be consistent with what happens for different configuration sections)
  • anything that matches zlib will have setenv FOO foo
  • anything that matches zlib%gcc@4.8 will have setenv FOOBAR foobar
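
One way to read the rules above is as an ordered pass over the configuration, sketched below. The dictionary-based spec, `toy_matches` and `resolve_set_entries` are hypothetical stand-ins for Spack's real spec matching, and treating a key with a trailing ':' as "discard what was collected so far" is an assumption about how the override behaves, not a description of the actual code.

```python
def toy_matches(spec, selector):
    """Toy matcher: supports 'all', '^dependency', 'name' and 'name%compiler'."""
    if selector == "all":
        return True
    if selector.startswith("^"):
        return selector[1:] in spec["deps"]
    name, _, compiler = selector.partition("%")
    if name and name != spec["name"]:
        return False
    return not compiler or compiler == spec["compiler"]


def resolve_set_entries(rules, spec):
    """Collect the 'set' entries that apply to spec, in configuration order."""
    entries = {}
    for key, body in rules.items():
        override = key.endswith(":")  # YAML '^openmpi::' yields the key '^openmpi:'
        if not toy_matches(spec, key.rstrip(":")):
            continue
        if override:
            entries = {}  # assumption: '::' discards the rules matched earlier
        for item in body.get("environment", {}).get("set", []):
            name, value = item.split(",", 1)
            entries[name] = value
    return entries


rules = {
    "all": {"environment": {"set": ["BAR,bar"]}},
    "^openmpi:": {"environment": {"set": ["BAR,baz"]}},
    "zlib": {"environment": {"set": ["FOO,foo"]}},
    "zlib%gcc@4.8": {"environment": {"set": ["FOOBAR,foobar"]}},
}

zlib = {"name": "zlib", "compiler": "gcc@4.8", "deps": set()}
hdf5 = {"name": "hdf5", "compiler": "gcc@5.3.0", "deps": {"openmpi"}}

zlib_entries = resolve_set_entries(rules, zlib)
hdf5_entries = resolve_set_entries(rules, hdf5)
```

In this toy model the zlib module ends up with BAR, FOO and FOOBAR, while a package depending on openmpi only gets BAR set to baz.
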
Example : don't generate modules for things that are built with the system compiler

The following configuration file :

modules:
  tcl:
    whitelist: ['gcc', 'llvm']  # whitelist will have precedence over blacklist
    blacklist: ['%gcc@4.4.7']

will skip module file generation for anything that satisfies %gcc@4.4.7, with the exception of gcc and llvm.
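
The precedence rule can be sketched as a two-step check: the whitelist is consulted first, and the blacklist only matters if nothing in the whitelist matched. The spec dictionaries and the `toy_matches` helper are hypothetical simplifications of Spack's spec matching, used here only to make the example self-contained.

```python
def toy_matches(spec, constraint):
    """Toy matcher for constraints of the form 'name', '%compiler' or 'name%compiler'."""
    name, _, compiler = constraint.partition("%")
    if name and name != spec["name"]:
        return False
    return not compiler or compiler == spec["compiler"]


def should_write_module(spec, whitelist, blacklist):
    """Whitelist wins over blacklist; anything unlisted is written by default."""
    if any(toy_matches(spec, item) for item in whitelist):
        return True
    return not any(toy_matches(spec, item) for item in blacklist)


whitelist = ["gcc", "llvm"]
blacklist = ["%gcc@4.4.7"]

gcc = {"name": "gcc", "compiler": "gcc@4.4.7"}
zlib_system = {"name": "zlib", "compiler": "gcc@4.4.7"}
zlib_new = {"name": "zlib", "compiler": "gcc@5.3.0"}

results = [
    should_write_module(spec, whitelist, blacklist)
    for spec in (gcc, zlib_system, zlib_new)
]
```

Here gcc built with the system compiler still gets a module file, zlib built with the system compiler does not, and zlib built with a newer gcc does.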

Example : customize the naming scheme and insert conflicts

A configuration file like:

modules:
  tcl:
    naming_scheme: '{name}/{version}-{compiler.name}-{compiler.version}'
    all:
      conflict: ['{name}', 'intel/14.0.1']

will create module files that conflict with intel/14.0.1 and with the base directory of the same module (i.e. you cannot have two versions of the same package loaded at the same time). The conflict directive currently accepts as items :

  • string literals
  • string "formats" that match part of the dirname in naming_scheme
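
A minimal sketch of how such a scheme could be expanded, assuming plain Python `str.format` substitution with attribute access (the real implementation may well differ). The `SimpleNamespace` spec and the helpers `tokens`, `module_name` and `conflict_lines` are hypothetical names introduced for the illustration.

```python
from types import SimpleNamespace

naming_scheme = "{name}/{version}-{compiler.name}-{compiler.version}"

# Hypothetical stand-in for a concrete spec.
spec = SimpleNamespace(
    name="zlib",
    version="1.2.8",
    compiler=SimpleNamespace(name="gcc", version="5.3.0"),
)


def tokens(spec):
    """Values made available to the format strings."""
    return {"name": spec.name, "version": spec.version, "compiler": spec.compiler}


def module_name(spec):
    """Expand the naming scheme into a module file path."""
    return naming_scheme.format(**tokens(spec))


def conflict_lines(spec, conflicts):
    """Expand each conflict item; literals without braces pass through unchanged."""
    return ["conflict {0}\n".format(item.format(**tokens(spec))) for item in conflicts]


name = module_name(spec)
lines = conflict_lines(spec, ["{name}", "intel/14.0.1"])
```

With these inputs the module would be named zlib/1.2.8-gcc-5.3.0, and its file would carry one conflict line for zlib and one for intel/14.0.1.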

@alalazo alalazo changed the title [WIP] feature : customization of module files from configuration files [WIP] feature : customization of modules from configuration files Apr 5, 2016
@alalazo alalazo changed the title [WIP] feature : customization of modules from configuration files feature : customization of modules from configuration files Apr 6, 2016
pass
# TODO : the code down below is quite similar to build_environment.setup_package and needs to be
# TODO : factored out to a single place
for item in dependencies(self.spec, 'All'):
Member Author

@tgamblin Adding these three lines:

for item in self.spec.traverse(order='post', depth=True, cover='nodes', root=False):
    tty.msg(item)
tty.msg('')

immediately above here gives me the following output for gcc@5.3.0 during spack module refresh:

==> (1, binutils@2.26%gcc@4.8+gold~krellpatch~libiberty=production)
==> (1, gmp@6.1.0%gcc@4.8=production)
==> (2, gmp@6.1.0%gcc@4.8=production)
==> (1, isl@0.14%gcc@4.8=production^gmp@6.1.0%gcc@4.8=production)
==> (2, gmp@6.1.0%gcc@4.8=production)
==> (3, gmp@6.1.0%gcc@4.8=production)
==> (2, mpfr@3.1.4%gcc@4.8=production^gmp@6.1.0%gcc@4.8=production)
==> (1, mpc@1.0.3%gcc@4.8=production^gmp@6.1.0%gcc@4.8=production^mpfr@3.1.4%gcc@4.8=production)
==> (2, gmp@6.1.0%gcc@4.8=production)
==> (1, mpfr@3.1.4%gcc@4.8=production^gmp@6.1.0%gcc@4.8=production)

The point that I want you to note is that we are visiting gmp@6.1.0%gcc@4.8=production multiple times. The behavior won't change if you remove depth=True and is not consistent with what happens at build time (where the nodes are visited only once).

I can't tell whether we are missing a call to some method before entering here (to collapse nodes that refer to the same spec) or whether this is the way it is supposed to work.
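
To make the difference concrete, here is a toy post-order traversal over a hand-written dependency graph. It is not Spack's `traverse`: the `deps` dictionary is only my reading of the specs printed above, and `post_order` with its `unique` flag is a hypothetical helper. Without a visited set it reproduces the repeated gmp visits; with one, each node appears once.

```python
# Dependency edges inferred from the output above (an assumption, not Spack data).
deps = {
    "gcc": ["binutils", "gmp", "isl", "mpc", "mpfr"],
    "binutils": [],
    "gmp": [],
    "isl": ["gmp"],
    "mpc": ["gmp", "mpfr"],
    "mpfr": ["gmp"],
}


def post_order(graph, root, unique=True):
    """Return (depth, node) pairs in post-order, excluding the root itself."""
    visited = set()

    def visit(node, depth):
        if unique:
            if node in visited:
                return
            visited.add(node)
        for child in graph[node]:
            yield from visit(child, depth + 1)
        if depth > 0:
            yield depth, node

    return list(visit(root, 0))


every_path = post_order(deps, "gcc", unique=False)  # follows every edge
each_node_once = post_order(deps, "gcc", unique=True)  # collapses repeated nodes
```

The first call walks every path through the graph, which is the pattern seen in the log; the second is the behavior one would expect if nodes referring to the same spec were collapsed.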

@alalazo
Member Author

alalazo commented Apr 6, 2016

@tgamblin @citibeth @adamjstewart @glennpj @nrichart (or anybody else that might be interested in this), I think I am ready to collect a first round of comments / questions on this PR. Fire at will 😄

'}}')

prerequisite_format = 'prereq {module_file}\n'

Member Author

@tgamblin Do you know if you can do something similar to autoload_format and prerequisite_format with dotkit?

@adamjstewart
Member

This looks very useful. We currently use SoftEnv at Argonne, but are planning to switch to some form of modules within the next year. Not sure which are currently supported by Spack or the pros and cons of each, but that would obviously influence our decision.

Questions:

  1. Are these module configurations going to be part of the main Spack repo, or will new users need to figure everything out for themselves? It would be great to keep them in Spack, so I can install and immediately load them with all the proper environment variables set.
  2. Will Spack be able to use these environment settings when trying to install other packages? The problems we were having with the Intel compilers come to mind. Many of those problems can be solved by having the right library paths set by Spack.

@luigi-calori
Contributor

This work sounds really interesting for our case.
Looking at the code, it seems one can select either automatic loading of dependent modules or requiring them as prerequisites.
I would like to test it and see what else could be missing in order to produce modules similar to what we currently edit by hand (likely header and footer configuration).
Is it ready to test? Can I just try to merge it with the current develop branch?
Thanks for the work

@alalazo
Member Author

alalazo commented Apr 6, 2016

@adamjstewart

> This looks very useful. We currently use SoftEnv at Argonne, but are planning to switch to some form of modules within the next year. Not sure which are currently supported by Spack or the pros and cons of each, but that would obviously influence our decision.

Currently develop supports tcl and dotkit, PR #107 adds support for lmod.

> Are these module configurations going to be part of the main Spack repo, or will new users need to figure everything out for themselves? It would be great to keep them in Spack, so I can install and immediately load them with all the proper environment variables set.

The features above are meant for user-level configuration files, while the file shipped with spack will just enable the creation of both tcl and dotkit module files (without any customization). Anyhow, spack uses sensible lookup rules and the information provided by dependencies to create the default module files. In my experience this is what you want 99% of the time.

The customization covers some particular needs. For instance, at my site we don't want to have CPATH set in module files (whereas spack default is to set it). We'll solve the issue with a modules.yaml like:

modules::
  enable: ['tcl']
  tcl:
    all:
      filter:
        environment_blacklist: ['CPATH']

Another case is IntelMPI, where the fabrics in use are set at run-time using environment variables. You can add these using something like:

modules::
  enable: ['tcl']
  tcl:
    intelmpi=<your-architecture>:
      environment:
        set : [<set-the-environment-variables-here>]

that will modify only modules matching intelmpi=<your-architecture>.

> Will Spack be able to use these environment settings when trying to install other packages? The problems we were having with the Intel compilers come to mind. Many of those problems can be solved by having the right library paths set by Spack.

Not really, this PR is just meant to address module file creation. I wonder in fact if there are cases where you need per-site custom information during the installation phase (I can't think of any right now). The point about the Intel compiler is a good one, but then I think that compilers are a bit different from normal packages in the sense that you don't depend on a compiler, and so far the mechanism used to inject customization in a package relies on traversing its dependencies.

@alalazo
Member Author

alalazo commented Apr 6, 2016

@citibeth I forgot to mention that this:

modules::
  enable: ['tcl']
  tcl:
    ^python:
      autoload: 'all'

will be quite close to what you want to achieve with #721 : all the modules of packages that depend on python will autoload all their dependencies.

@glennpj
Contributor

glennpj commented Apr 6, 2016

@alalazo This looks awesome at a high level. The flexibility of working with subsets of packages and dependencies will give spack features that other systems do not have, that I know of anyway.

@adamjstewart If you are looking to use an environment module system then I recommend Lmod. It can do everything that TCL modules can do plus more. It works with TCL module files through a converter so it works right now in spack.

@glennpj
Contributor

glennpj commented Apr 9, 2016

@alalazo I pulled the PR and tried a spack module refresh after setting the following in modules.yaml:

tcl:
    all:
      autoload: direct

The modules that set any module loading are non-functional with Lmod.

module show python-2.7.11-gcc-5.3.0-zowdyyxtbfjheerrhpvewuri34noptxf 
Lmod has detected the following error: 
/home/gjohnson/spack/share/spack/modules/linux-x86_64/python-2.7.11-gcc-5.3.0-zowdyyxtbfjheerrhpvewuri34noptxf:
(python-2.7.11-gcc-5.3.0-zowdyyxtbfjheerrhpvewuri34noptxf): extra characters after close-brace 

Looking at the module file itself, there are no line breaks for the autoload statements. The following is all on one line:

if ![ is-loaded zlib-1.2.8-gcc-5.3.0-3cdrf2tpqqmo3qdo5hhjitio62iqgxed ] {    puts stderr "Autoloading zlib-1.2.8-gcc-5.3.0-3cdrf2tpqqmo3qdo5hhjitio62iqgxed"    module load zlib-1.2.8-gcc-5.3.0-3cdrf2tpqqmo3qdo5hhjitio62iqgxed}if ![ is-loaded ncurses-6.0-gcc-5.3.0-bo3dzyuwbnbwpdbqjupopedgwbuklmnt ] {    puts stderr "Autoloading ncurses-6.0-gcc-5.3.0-bo3dzyuwbnbwpdbqjupopedgwbuklmnt"    module load ncurses-6.0-gcc-5.3.0-bo3dzyuwbnbwpdbqjupopedgwbuklmnt}if ![ is-loaded sqlite-3.8.5-gcc-5.3.0-hhq5za5zica6omruoyrbw7g4scbzta2r ] {    puts stderr "Autoloading sqlite-3.8.5-gcc-5.3.0-hhq5za5zica6omruoyrbw7g4scbzta2r"    module load sqlite-3.8.5-gcc-5.3.0-hhq5za5zica6omruoyrbw7g4scbzta2r}if ![ is-loaded readline-6.3-gcc-5.3.0-752uupejxmzzzroligqszrejh74bs5r7 ] {    puts stderr "Autoloading readline-6.3-gcc-5.3.0-752uupejxmzzzroligqszrejh74bs5r7"    module load readline-6.3-gcc-5.3.0-752uupejxmzzzroligqszrejh74bs5r7}if ![ is-loaded openssl-1.0.2g-gcc-5.3.0-zg6cjmcacvleczjal5b2ktffwhfkxp4c ] {    puts stderr "Autoloading openssl-1.0.2g-gcc-5.3.0-zg6cjmcacvleczjal5b2ktffwhfkxp4c"    module load openssl-1.0.2g-gcc-5.3.0-zg6cjmcacvleczjal5b2ktffwhfkxp4c}if ![ is-loaded bzip2-1.0.6-gcc-5.3.0-alatbh7cqp2aobgcvg6zxrasrmfsdfdk ] {    puts stderr "Autoloading bzip2-1.0.6-gcc-5.3.0-alatbh7cqp2aobgcvg6zxrasrmfsdfdk"    module load bzip2-1.0.6-gcc-5.3.0-alatbh7cqp2aobgcvg6zxrasrmfsdfdk}prepend-path PATH "/home/gjohnson/spack/opt/spack/linux-x86_64/gcc-5.3.0/python-2.7.11-zowdyyxtbfjheerrhpvewuri34noptxf/bin"
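
For comparison, a template along the following lines would put each statement on its own line. This is a sketch reconstructed from the output above rather than the PR's actual code; `AUTOLOAD_TEMPLATE` and `autoload_block` are hypothetical names, and the short module names are placeholders.

```python
# Doubled braces are literal braces for str.format; every line ends with '\n'.
AUTOLOAD_TEMPLATE = (
    "if ![ is-loaded {module_file} ] {{\n"
    '    puts stderr "Autoloading {module_file}"\n'
    "    module load {module_file}\n"
    "}}\n"
)


def autoload_block(module_files):
    """Render one autoload stanza per dependency, each terminated by a newline."""
    return "".join(AUTOLOAD_TEMPLATE.format(module_file=m) for m in module_files)


text = autoload_block(["zlib-1.2.8", "ncurses-6.0"])
```

With the trailing newline in place, the closing brace of one stanza is never immediately followed by the next `if`, which is the condition Lmod reports as "extra characters after close-brace".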

@glennpj
Contributor

glennpj commented Apr 9, 2016

I wonder if considering module naming should be part of this. I am referring to the actual names of the modules and not the layout. As mentioned in PR #498 the hash in the name is not human parseable. Since an environment module system is ultimately the interface end users have to the installed software the names need to be "friendly". If doing autoloading and prereqs then those names need to be listed in dependent module files.

@alalazo
Member Author

alalazo commented Apr 10, 2016

@glennpj I still have to work out the details of that, but as far as I can tell module 'layout' and module 'naming' should be the same thing. Currently the core part in constructing a module name is the function use_name, which returns something like:

return "%s-%s-%s-%s-%s" % (
    self.spec.name, self.spec.version,
    self.spec.compiler.name,
    self.spec.compiler.version,
    self.spec.dag_hash())

The idea I have in mind is to specify the various components from modules.yaml and to be able to substitute the dash with a slash. An example along the lines of #498 :

modules:
  tcl:
    naming: '{name}/{version}-{compiler-name}-{compiler-version}'

The things I still have to think about are basically how to handle file name collisions and how to deal with variants (e.g. you may want to have the concrete mpi in the name too, or something like that).

For collisions I believe that writing the full spec in a comment within the module file, and asking the user for permission to overwrite an existing module file, will solve the problem. For variants I guess I'll need to try a few options and see which one works best. Does that make sense to you?

Regarding making changes to the naming scheme part of this PR, I would rather finalize what is already in place and submit another PR once #744 has been merged. What do you think @tgamblin ?

@alalazo
Member Author

alalazo commented Apr 10, 2016

@glennpj Thanks for the bug report, I forgot the newlines when refactoring the multi-line strings 😄 I'll fix them asap

@alalazo
Member Author

alalazo commented Apr 10, 2016

@glennpj Should be fixed now.

@coveralls

coveralls commented May 10, 2016

Coverage Status

Coverage increased (+1.4%) to 64.586% when pulling 71e49e2 on epfl-scitas:custom_modulefiles_from_config into 1a563c2 on LLNL:develop.

@alalazo
Member Author

alalazo commented May 10, 2016

@tgamblin The two changes are in. Now only the docs are missing : would you prefer to have them in a separate PR?

@tgamblin
Member

@alalazo: same PR is ok with me!

@glennpj
Contributor

glennpj commented May 11, 2016

@alalazo Would it be possible to add the ability to have module load directives specified in the modules.yaml file as described in #744 (comment)?

Thanks.

@tgamblin
Member

@glennpj: for the example in the comment, I believe @alalazo's prereq/autoload support does what you want. Are you looking to also add certain system modules as prereqs? I think that would be useful but maybe it could be added in another PR?

@glennpj
Contributor

glennpj commented May 11, 2016

@tgamblin What I am referring to is not covered by autoloading. Being able to add explicit load directives in the modules.yaml file would be useful for loading modules that are not built with spack. Spack does not know about them but the admins/users do.

It would also be useful when using extension activation. For example, py-numpy depends on openblas so will autoload the openblas module. However, if py-numpy is activated then the py-numpy module does not need to be loaded, just the python module does. However, the openblas module would not be autoloaded when the python module is loaded. This would definitely be a new PR to handle activation but being able to add those in modules.yaml would be a work-around.

I am more interested in the external, non-spack built, module case but it could certainly be another PR.

@tgamblin
Member

@glennpj:

> For example, py-numpy depends on openblas so will autoload the openblas module. However, if py-numpy is activated then the py-numpy module does not need to be loaded, just the python module does.

This one actually sounds like a bug -- the numpy binaries should have RPATHs for their dependencies and you shouldn't need to load openblas. In the activation case the goal is for things to "just work".

For the non-activated case, the dep management here should handle loading anything that needs a PYTHONPATH as well as other packages (though again they should be RPATH'd).

Either way, though, I see lots of clear use cases for the external, non-spack-built case. So I think that should go in eventually. I can't implement it at the moment though so up to @alalazo whether he wants to put it in here or leave it for another PR.

@citibeth
Member

On Tue, May 10, 2016 at 11:34 PM, Todd Gamblin notifications@github.com
wrote:

> @glennpj https://github.com/glennpj:
>
>> For example, py-numpy depends on openblas so will autoload the openblas
>> module. However, if py-numpy is activated then the py-numpy module does not
>> need to be loaded, just the python module does.
>
> This one actually sounds like a bug -- the numpy binaries should have
> RPATHs for their dependencies and you shouldn't need to load openblas.

I agree, it's a bug, Python extensions don't currently have RPATHs. See:

#935

I believe fixing the bug involves figuring out the right incantations with
Python's setuptools to get RPATH included.

-- Elizabeth

@glennpj
Contributor

glennpj commented May 11, 2016

@citibeth I believe you meant issue #719.

@tgamblin Yes, the non-activated case is fine. Even with RPATH, module loading is handy when a package needs the environment variables of its dependencies to be set.

@alalazo
Member Author

alalazo commented May 11, 2016

@tgamblin @glennpj docs are in. If you don't mind I would prefer to merge this and work on the feature request in another PR. I'm starting to be a little bit scared by the size of this one...

@tgamblin
Member

@alalazo: I empathize with your sentiment 😄

Thanks for adding the feature for @glennpj!

@@ -5,4 +5,14 @@
# although users can override these settings in their ~/.spack/modules.yaml.
# -------------------------------------------------------------------------
modules:
  prefix_inspections: {
    bin: ['PATH'],
Member

what's the comma at the end of the line here?

Member

Oh I see -- the whole thing is in braces. I don't think you have to do that if the dict is spread over multiple lines.

@glennpj
Contributor

glennpj commented May 11, 2016

@alalazo Another PR is fine by me. I think this PR is awesome by the way!

Thanks.

@tgamblin tgamblin merged commit bb4b6c8 into spack:develop May 11, 2016
@tgamblin
Member

Merged! Thanks to @alalazo for doing all this work!

@alalazo
Member Author

alalazo commented May 11, 2016

@glennpj @tgamblin Thank you!

@alalazo alalazo deleted the custom_modulefiles_from_config branch May 11, 2016 17:20
matz-e pushed a commit to matz-e/spack that referenced this pull request Apr 27, 2020