spack builds depended on user modules #27124

Open
3 tasks done
shahzebsiddiqui opened this issue Nov 1, 2021 · 6 comments
Labels
bug, cray, impact-low, nersc

Comments

@shahzebsiddiqui
Contributor

shahzebsiddiqui commented Nov 1, 2021

Steps to reproduce

I have noticed that Spack builds are affected by the modules loaded in the user's environment. To reproduce the issue, here is a simple spack.yaml with a compiler definition for gcc on Perlmutter that uses the Cray wrappers and relies on modules to load the appropriate environment.

siddiq90@login37> cat spack.yaml 
# This is a Spack Environment file.
#
# It describes a set of packages to be installed, along with
# configuration settings.
spack:
  # add package specs to the `specs` list
  specs: []
  compilers:
   - compiler:
      spec: gcc@11.2.0
      paths:
        cc: cc
        cxx: CC
        f77: ftn
        fc: ftn
      operating_system: sles15
      modules:
      - PrgEnv-gnu/8.1.0
      - gcc/11.2.0
      - cuda/11.3.0
      - libfabric/1.11.0.4.79

  view: true

Next, create the environment in the directory that contains the spack.yaml:

spack env create .

Here is the error I get when I install libsigsegv with no modules loaded:

siddiq90@login37> module list
No modules loaded
siddiq90@login37> spack install libsigsegv
==> Installing libsigsegv-2.13-672qbf4xgvwea24ylbzn6ow32qlwl236
==> No binary for libsigsegv-2.13-672qbf4xgvwea24ylbzn6ow32qlwl236 found: installing from source
==> Error: CompilerAccessError: Compiler 'gcc@11.2.0' has executables that are missing or are not executable: ['CC', 'ftn', 'ftn']

/global/common/software/spackecp/perlmutter/e4s-21.08/spack/lib/spack/spack/build_environment.py:1029, in _setup_pkg_and_run:
       1026        tb_string = traceback.format_exc()
       1027
       1028        # build up some context from the offending package so we can
  >>   1029        # show that, too.
       1030        package_context = get_package_context(tb)
       1031
       1032        logfile = None

Spack was not able to find the Cray wrappers. I then tried module restore, which loads the startup modules provided by Cray, and was able to install libsigsegv:

siddiq90@login37> module restore
Resetting modules to system default. Reseting $MODULEPATH back to system default. All extra directories will be removed from $MODULEPATH.
siddiq90@login37> module list

Currently Loaded Modules:
  1) craype-x86-rome                                   6) nvidia/21.7           (g,c)   11) PrgEnv-nvidia/8.1.0 (cpe)
  2) libfabric/1.11.0.4.79                             7) craype/2.7.10         (c)     12) cray-pmi/6.0.13
  3) craype-network-ofi                                8) cray-dsmml/0.2.1              13) cray-pmi-lib/6.0.13
  4) perftools-base/21.09.0                    (dev)   9) cray-mpich/8.1.9      (mpi)
  5) xpmem/2.2.40-7.0.1.0_3.1__g1d7a24d.shasta        10) cray-libsci/21.08.1.2 (math)

  Where:
   g:     built for GPU
   mpi:   MPI Providers
   cpe:   Cray Programming Environment Modules
   math:  Mathematical libraries
   c:     Compiler
   dev:   Development Tools and Programming Languages

 

siddiq90@login37> spack install libsigsegv
==> Installing libsigsegv-2.13-672qbf4xgvwea24ylbzn6ow32qlwl236
==> No binary for libsigsegv-2.13-672qbf4xgvwea24ylbzn6ow32qlwl236 found: installing from source
==> Fetching https://mirror.spack.io/_source-cache/archive/be/be78ee4176b05f7c75ff03298d84874db90f4b6c9d5503f0da1226b3a3c48119.tar.gz
==> No patches needed for libsigsegv
==> libsigsegv: Executing phase: 'autoreconf'
==> libsigsegv: Executing phase: 'configure'
==> libsigsegv: Executing phase: 'build'
==> libsigsegv: Executing phase: 'install'
==> libsigsegv: Successfully installed libsigsegv-2.13-672qbf4xgvwea24ylbzn6ow32qlwl236
  Fetch: 0.66s.  Build: 10.98s.  Total: 11.64s.
[+] /global/common/software/spackecp/perlmutter/e4s-21.08/spack/opt/spack/cray-sles15-zen3/gcc-11.2.0/libsigsegv-2.13-672qbf4xgvwea24ylbzn6ow32qlwl236
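The difference between the two runs above appears to come down to whether the wrapper names listed under paths in spack.yaml resolve to executables in the current environment. A minimal sketch for checking that outside of Spack follows. The helper name missing_executables is hypothetical, and this is not how Spack itself performs the check; it only uses the standard library's shutil.which.

```python
import os
import shutil


def missing_executables(names, search_path=None):
    """Return the names that do not resolve to an executable file.

    search_path is an os.pathsep-separated string of directories;
    None means "use the current PATH". (Hypothetical helper, not Spack code.)
    """
    missing = []
    for name in names:
        # shutil.which returns None when no executable with that name is found.
        if shutil.which(name, path=search_path) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Wrapper names copied from the compiler entry in spack.yaml (cc, cxx, f77, fc).
    wrappers = ["cc", "CC", "ftn", "ftn"]
    print("not found on PATH:", missing_executables(wrappers))
```

Running this once after module purge and once after module restore should show whether the wrappers are visible in each state.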

I am using the Spack branch https://github.com/spack/spack/tree/e4s-21.08, though I think this can be reproduced regardless of Spack version.

Error message

No response

Information on your system

siddiq90@login37> spack debug report
* **Spack:** 0.16.2-3949-8831cb2eed
* **Python:** 3.6.12
* **Platform:** cray-sles15-zen3
* **Concretizer:** original

General information

  • I have run spack debug report and reported the version of Spack/Python/Platform
  • I have searched the issues of this repo and believe this is not a duplicate
  • I have run the failing commands in debug mode and reported the output
shahzebsiddiqui added the bug and triage labels Nov 1, 2021
@shahzebsiddiqui
Contributor Author

I realized there is an issue with loading modules after purging, and we don't know how cray-mpich can be loaded after a purge. This means that Spack builds run after module purge behave differently from builds run with active modules in the environment.

siddiq90@login37> module purge
siddiq90@login37> module load PrgEnv-gnu/8.1.0 gcc/11.2.0 cuda/11.3.0 libfabrics/1.11.0.4.79
Lmod has detected the following error:  The following module(s) are unknown: "libfabrics/1.11.0.4.79" "cray-mpich"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore-cache load "libfabrics/1.11.0.4.79" "cray-mpich"

Also make sure that all modulefiles written in TCL start with the string #%Module
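Since a purge seems to change which modules are visible, one way to investigate is to record $MODULEPATH before and after module purge and compare the two values. The sketch below assumes the two strings were captured by hand (for example with echo $MODULEPATH); the function names are hypothetical and nothing here calls Lmod or Spack.

```python
def split_path(value):
    """Split a colon-separated path variable, dropping empty entries."""
    return [entry for entry in value.split(":") if entry]


def modulepath_diff(before, after):
    """Return (removed, added) directories between two MODULEPATH values.

    Order within each list follows the order in the input strings.
    """
    old, new = split_path(before), split_path(after)
    removed = [entry for entry in old if entry not in new]
    added = [entry for entry in new if entry not in old]
    return removed, added


if __name__ == "__main__":
    # Placeholder values; substitute the strings captured on the system.
    before = "/example/a:/example/b"
    after = "/example/b"
    removed, added = modulepath_diff(before, after)
    print("removed:", removed)
    print("added:", added)
```

Any directory that shows up as removed is a candidate for where the now-unknown modules used to be found.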

@shahzebsiddiqui
Contributor Author

After some troubleshooting, I found that the way to get cray-mpich or any of the PrgEnv-* modules to load is to first load craype-network-ofi, which adds the path to the cray-mpich module:

siddiq90@login37> module purge
siddiq90@login37> module load craype-network-ofi PrgEnv-nvidia
siddiq90@login37> ml

Currently Loaded Modules:
  1) libfabric/1.11.0.4.79   3) nvidia/21.7   (g,c)   5) cray-dsmml/0.2.1         7) cray-libsci/21.08.1.2 (math)
  2) craype-network-ofi      4) craype/2.7.10 (c)     6) cray-mpich/8.1.9 (mpi)   8) PrgEnv-nvidia/8.1.0   (cpe)

  Where:
   g:     built for GPU
   mpi:   MPI Providers
   cpe:   Cray Programming Environment Modules
   math:  Mathematical libraries
   c:     Compiler

The cray-mpich modulefile lives under /opt/cray/pe/lmod/modulefiles/comnet/nvidia/20/ofi/1.0:

siddiq90@login37> ml -t av cray-mpich
/opt/cray/pe/lmod/modulefiles/comnet/nvidia/20/ofi/1.0:
cray-mpich-abi/8.1.9
cray-mpich/8.1.9
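For checking several modules at once, the terse listing above is easy to process. The following sketch assumes the text of ml -t av has already been captured into a string, and that directory headers start with "/" and end with ":" as in the output shown here; parse_terse_avail is a hypothetical name.

```python
def parse_terse_avail(text):
    """Map each module directory to the module names listed under it."""
    tree = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("/") and line.endswith(":"):
            # A directory header such as "/opt/.../ofi/1.0:".
            current = line[:-1]
            tree[current] = []
        elif current is not None:
            # A module entry belonging to the most recent directory header.
            tree[current].append(line)
    return tree


if __name__ == "__main__":
    sample = (
        "/opt/cray/pe/lmod/modulefiles/comnet/nvidia/20/ofi/1.0:\n"
        "cray-mpich-abi/8.1.9\n"
        "cray-mpich/8.1.9\n"
    )
    for directory, modules in parse_terse_avail(sample).items():
        print(directory, modules)
```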

That directory is added to MODULEPATH by this modulefile:

siddiq90@login37> module --redirect show craype-network-ofi | grep MODULEPATH
prepend_path("MODULEPATH","/opt/cray/pe/lmod/modulefiles/net/ofi/1.0")
prepend_path("MODULEPATH","/opt/cray/pe/lmod/modulefiles/comnet/nvidia/20/ofi/1.0")
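The same check can be scripted for other modulefiles. This is a rough sketch that extracts the MODULEPATH directories from captured module --redirect show output, assuming the prepend_path("MODULEPATH", "...") form seen above; the function name is hypothetical and other modulefile syntaxes (for example TCL modulefiles) are not handled.

```python
import re

# Matches lines of the form: prepend_path("MODULEPATH","/some/dir")
_PREPEND_MODULEPATH = re.compile(
    r'prepend_path\(\s*"MODULEPATH"\s*,\s*"([^"]+)"\s*\)'
)


def modulepath_additions(show_output):
    """Return the directories a modulefile prepends to MODULEPATH,
    in the order they appear in the captured text."""
    return _PREPEND_MODULEPATH.findall(show_output)


if __name__ == "__main__":
    captured = (
        'prepend_path("MODULEPATH","/opt/cray/pe/lmod/modulefiles/net/ofi/1.0")\n'
        'prepend_path("MODULEPATH","/opt/cray/pe/lmod/modulefiles/comnet/nvidia/20/ofi/1.0")\n'
    )
    for directory in modulepath_additions(captured):
        print(directory)
```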

@shahzebsiddiqui
Contributor Author

I have reported this issue to Cray and pointed them to this issue for feedback. I would be curious to know whether other sites see similar differences between running spack install after module purge and running it with the startup modules loaded.

@alalazo
Member

alalazo commented Apr 7, 2023

Why was this marked as "impact-high"?

alalazo added the cray label and removed the triage label Apr 7, 2023
@shahzebsiddiqui
Contributor Author

I think I labelled this with several tags, including impact-high. I know it's a known issue, and I can confirm it is still present on Perlmutter with the Spack develop branch.

@alalazo
Member

alalazo commented Nov 28, 2023

> I think i labelled this with several tags including impact-high

It was indeed. Then, after several months without a reply on #27124 (comment) and no further activity, I demoted it to impact-low. Feel free to restore it to impact-high if that seems more appropriate.

Usually I give "impact-high" to issues that need an immediate resolution because they severely affect many users on multiple platforms (but we have no written guidelines on that, so ymmv).
